Smart Paper Notes

IoT-Based Pothole Mapping Agent with Remote Visualization

Umar Yahya, Mwaka Lucky, Muhammed Mansoor, Nankabirwa Sharifah, Abdal Kasule, Kasagga Usama

Category: Robotics

2022-12-25

Driving on pothole-infested roads is hazardous to life and economically costly. The experience is even worse for motorists using a pothole-filled road for the first time. Pothole-filled road networks have been associated with severe traffic jams, especially during peak times of the day. Besides wasting fuel and time, traffic jams often lead to increased carbon emissions and noise pollution. Moreover, the risk of fatal accidents has been strongly associated with potholes, among other road network factors. Discovering potholes before using a particular road is therefore of significant importance. This work presents a successful demonstration of a sensor-based pothole mapping agent that captures both a pothole's depth and its location coordinates, parameters that are then used to generate a pothole map for the agent's entire journey. The map can thus be shared with all motorists intending to use the same route.
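
As a rough illustration of how depth and location readings could be turned into a shareable map, the sketch below converts a list of readings into a GeoJSON feature collection. The record fields, the depth threshold, and the sample coordinates are assumptions made for this example and are not taken from the paper.

```python
import json
from dataclasses import dataclass


@dataclass
class PotholeReading:
    """One hypothetical reading: GPS position plus measured depth."""
    latitude: float
    longitude: float
    depth_cm: float


def to_geojson(readings, min_depth_cm=2.0):
    """Build a GeoJSON FeatureCollection, skipping readings below the threshold."""
    features = []
    for r in readings:
        if r.depth_cm < min_depth_cm:
            continue  # assumed cut-off: too shallow to count as a pothole
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [r.longitude, r.latitude]},
            "properties": {"depth_cm": r.depth_cm},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample journey with three readings.
readings = [
    PotholeReading(0.3476, 32.5825, 7.5),
    PotholeReading(0.3480, 32.5831, 1.0),   # below the assumed threshold
    PotholeReading(0.3491, 32.5840, 12.0),
]
pothole_map = to_geojson(readings)
print(json.dumps(pothole_map, indent=2))
```

GeoJSON is used here only because common web map viewers can display it directly; the paper's own visualization pipeline may differ.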

Fast Parallel Exact Inference on Bayesian Networks: Poster

Jiantong Jiang, Zeyi Wen, Atif Mansoor, Ajmal Mian

Category: Artificial Intelligence

2022-12-08

Bayesian networks (BNs) are attractive because they are graphical and interpretable machine learning models. However, exact inference on BNs is time-consuming, especially for complex problems. To improve efficiency, we propose Fast-BNI, a fast BN exact inference solution for multi-core CPUs. Fast-BNI enhances the efficiency of exact inference through hybrid parallelism that tightly integrates coarse- and fine-grained parallelism. We also propose techniques to further simplify the bottleneck operations of BN exact inference. Fast-BNI source code is freely available at https://github.com/jjiantong/FastBN.
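
To make concrete what "exact inference" computes, here is a minimal sketch that answers one posterior query on a three-node network by brute-force enumeration. It is not Fast-BNI's algorithm and involves no parallelism; the network, its probabilities, and all names are textbook-style assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical network: Rain -> Sprinkler, and (Sprinkler, Rain) -> WetGrass.
P_RAIN = {True: 0.2, False: 0.8}
# P(Sprinkler = s | Rain = r), indexed as P_SPRINKLER[r][s].
P_SPRINKLER = {True: {True: 0.01, False: 0.99},
               False: {True: 0.4, False: 0.6}}
# P(WetGrass = True | Sprinkler = s, Rain = r), indexed by (s, r).
P_WET_TRUE = {(True, True): 0.99, (True, False): 0.9,
              (False, True): 0.8, (False, False): 0.0}


def joint(rain, sprinkler, wet):
    """Joint probability of one full assignment via the chain rule."""
    p_wet = P_WET_TRUE[(sprinkler, rain)]
    return (P_RAIN[rain]
            * P_SPRINKLER[rain][sprinkler]
            * (p_wet if wet else 1.0 - p_wet))


def posterior_rain_given_wet():
    """P(Rain = True | WetGrass = True), summing out Sprinkler exactly."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True)
                   for r, s in product((True, False), repeat=2))
    return numerator / evidence


posterior = posterior_rain_given_wet()
print(f"P(Rain | WetGrass) = {posterior:.4f}")
```

Enumeration visits every joint assignment, so its cost grows exponentially with the number of variables, which is why exact inference on larger networks is time-consuming.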

AdsorbML: Accelerating Adsorption Energy Calculations with Machine Learning

Janice Lan, Aini Palizhati, Muhammed Shuaibi, Brandon M. Wood, Brook Wander, Abhishek Das, Matt Uyttendaele, C. Lawrence Zitnick, Zachary W. Ulissi

Category: Machine Learning

2022-11-29

Computational catalysis is playing an increasingly significant role in the design of catalysts across a wide range of applications. A common task for many computational methods is the need to accurately compute the minimum binding energy (the adsorption energy) for an adsorbate and a catalyst surface of interest. Traditionally, the identification of low-energy adsorbate-surface configurations relies on heuristic methods and researcher intuition. As the desire to perform high-throughput screening increases, it becomes challenging to use heuristics and intuition alone. In this paper, we demonstrate that machine learning potentials can be leveraged to identify low-energy adsorbate-surface configurations more accurately and efficiently. Our algorithm provides a spectrum of trade-offs between accuracy and efficiency, with one balanced option finding the lowest-energy configuration, within a 0.1 eV threshold, 86.63% of the time, while achieving a 1387x speedup in computation. To standardize benchmarking, we introduce the Open Catalyst Dense dataset containing nearly 1,000 diverse surfaces and 87,045 unique configurations.
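
The "within a 0.1 eV threshold" success metric can be sketched as follows. This assumes a system counts as a success when the energy found is no more than 0.1 eV above the reference minimum; that reading, the function name, and the sample energies are assumptions for illustration, not the paper's exact definition or data.

```python
def success_rate(found_ev, reference_ev, threshold_ev=0.1):
    """Fraction of systems whose found minimum energy is within
    `threshold_ev` of (or below) the reference minimum energy."""
    if len(found_ev) != len(reference_ev):
        raise ValueError("inputs must have the same length")
    hits = sum(1 for f, r in zip(found_ev, reference_ev)
               if f <= r + threshold_ev)
    return hits / len(reference_ev)


# Made-up adsorption energies in eV for four adsorbate-surface systems.
reference = [-1.20, -0.45, -2.10, -0.80]
found = [-1.15, -0.30, -2.10, -0.85]

rate = success_rate(found, reference)
print(f"success rate: {rate:.2%}")
```

In this toy data the second system misses the threshold by 0.05 eV, so three of the four systems count as successes.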

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

SUPRA: Superpixel Guided Loss for Improved Multi-modal Segmentation in Endoscopy

Rafael Martinez Garcia-Peña, Mansoor Ali Teevno, Gilberto Ochoa-Ruiz, Sharib Ali

Category: Computer Vision | Machine Learning

2022-11-09

Domain shift is a well-known problem in the medical imaging community. In particular, for endoscopic image analysis, where the data can have different modalities, the performance of deep learning (DL) methods is adversely affected. In other words, methods developed on one modality cannot be used for a different modality. However, in real clinical settings, endoscopists switch between modalities for better mucosal visualisation. In this paper, we explore the domain generalisation technique to enable DL methods to be used in such scenarios. To this end, we propose to use superpixels generated with Simple Linear Iterative Clustering (SLIC), an approach we refer to as "SUPRA" for SUPeRpixel Augmented method. SUPRA first generates a preliminary segmentation mask making use of our new loss "SLICLoss", which encourages a segmentation that is both accurate and color-consistent. We demonstrate that SLICLoss, when combined with Binary Cross Entropy (BCE) loss, can improve the model's generalisability on data that presents significant domain shift. We validate this novel compound loss on a vanilla U-Net using the EndoUDA dataset, which contains images of Barrett's Esophagus and polyps from two modalities. We show that our method yields an improvement of nearly 25% in the target domain set compared to the baseline.

Reconstructing Action-Conditioned Human-Object Interactions Using Commonsense Knowledge Priors

Xi Wang, Gen Li, Yen-Ling Kuo, Muhammed Kocabas, Emre Aksan, Otmar Hilliges

Category: Computer Vision | Natural Language Processing

2022-09-06

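We present a method for inferring diverse 3D models of human-object interactions from images. Reasoning about how humans interact with objects in complex scenes from a single 2D image is a challenging task, given the ambiguities arising from the loss of information through projection. In addition, modeling 3D interactions requires the ability to generalize across diverse object categories and interaction types. We propose an action-conditioned modeling of interactions that allows us to infer diverse 3D arrangements of humans and objects without supervision on contact regions or 3D scene geometry. Our method extracts high-level commonsense knowledge from large language models (such as GPT-3) and applies it to 3D reasoning about human-object interactions. Our key insight is that priors extracted from large language models can help in reasoning about human-object contacts from textual prompts. We quantitatively evaluate the inferred 3D models on a large human-object interaction dataset and show how our method leads to better 3D reconstructions. We further evaluate the effectiveness of our method on real images and demonstrate its generalizability across interaction types and object categories.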

A comprehensive survey on recent deep learning-based methods applied to surgical data

Mansoor Ali, Rafael Martinez Garcia Pena, Gilberto Ochoa Ruiz, Sharib Ali

Category: Computer Vision

2022-09-03

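Minimally invasive surgery is highly operator-dependent and involves lengthy procedure times, causing fatigue and risk to patients. To mitigate these risks, real-time systems can help surgeons navigate and track tools by providing a clear understanding of the scene and avoiding miscalculations during the operation. Although several efforts have been made in this direction, the lack of diverse datasets, together with very dynamic scenes and their variability from patient to patient, remains a major hurdle to building robust systems. In this work, we present a systematic review of recent machine learning-based approaches, including surgical tool localisation, segmentation, tracking, and 3D scene perception. Furthermore, we present the current gaps in and directions for these methods, and provide the rationale behind their clinical integration.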

TempCLR: Reconstructing Hands via Time-Coherent Contrastive Learning

Andrea Ziani, Zicong Fan, Muhammed Kocabas, Sammy Christen, Otmar Hilliges

Category: Computer Vision

2022-09-01

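We introduce TempCLR, a new time-coherent contrastive learning approach for the structured regression task of 3D hand reconstruction. Unlike previous time-contrastive methods for hand pose estimation, our framework considers temporal consistency in its augmentation scheme and accounts for the differences in hand poses along the temporal direction. Our data-driven method leverages unlabelled videos and a standard CNN, without relying on synthetic data, pseudo-labels, or specialized architectures. Our approach improves the performance of fully supervised hand reconstruction methods by 15.9% and 7.6% on the HO-3D and FreiHAND datasets respectively, thus establishing new state-of-the-art performance. Finally, we demonstrate, both quantitatively and qualitatively, that our approach produces smoother hand reconstructions through time and is more robust to heavy occlusions than the previous state of the art. Our code and models will be available at https://eth-ait.github.io/tempclr.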

A semi-supervised Teacher-Student framework for surgical tool detection and localization

Mansoor Ali, Gilberto Ochoa-Ruiz, Sharib Ali

Category: Computer Vision | Machine Learning

2022-08-21

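Surgical tool detection in minimally invasive surgery is an essential part of computer-assisted interventions. Current approaches are mostly based on supervised methods, which require large amounts of fully labeled data to train supervised models and suffer from pseudo-label bias because of class imbalance. However, large image datasets with bounding box annotations are rarely available. Semi-supervised learning (SSL) has recently emerged as a means of training large models using only a modest amount of annotated data. Apart from reducing the annotation cost, SSL has also shown promise in producing models that are more robust and generalizable. In this paper, we therefore introduce an SSL framework for surgical tool detection that aims to mitigate the scarcity of training data and the data imbalance through a knowledge distillation approach. In the proposed work, we train a model on labeled data, which initialises Teacher-Student joint learning, where the Student is trained on Teacher-generated pseudo-labels from unlabeled data. We propose a multi-class distance with a margin-based classification loss function in the region-of-interest head of the detector to effectively separate foreground classes from the background region. Our results on the M2CAI16-Tool-locations dataset indicate the superiority of our approach across different supervised data settings (1%, 2%, 5%, and 10% of annotated data), where our model achieves overall improvements of 8%, 12%, and 27% in mAP (on 1% labeled data) over state-of-the-art SSL methods and a fully supervised baseline. The code is available at https://github.com/mansoor-at/semi-supervise-surgical-tool-det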

Incremental 3D Scene Completion for Safe and Efficient Exploration Mapping and Planning

Lukas Schmid, Mansoor Nasir Cheema, Victor Reijgwart, Roland Siegwart, Federico Tombari, Cesar Cadena

Category: Robotics | Computer Vision

2022-08-17

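Exploration of unknown environments is a fundamental problem in robotics and an essential component in applications of autonomous systems. A major challenge in exploring unknown environments is that the robot has to plan with the limited information available at each time step. While most current approaches rely on heuristics and assumptions to plan paths based on these partial observations, we propose a novel way to integrate deep learning into exploration by leveraging 3D scene completion for informed, safe, and interpretable exploration mapping and planning. Our approach, SC-Explorer, combines scene completion, using a novel incremental fusion mechanism, with a newly proposed hierarchical multi-layer mapping approach to guarantee the safety and efficiency of the robot. We further present an informative path planning method that leverages the capabilities of our mapping approach and a novel scene-completion-aware information gain. While our method is generally applicable, we evaluate it in the use case of a Micro Aerial Vehicle (MAV). We thoroughly study each component in high-fidelity simulation experiments using only mobile hardware, and show that our method can speed up coverage of an environment by 73% compared to the baselines with only a minimal reduction in map accuracy. Even if scene completions are not included in the final map, we show that they can be used to guide the robot towards more informative paths, speeding up the measurement of the scene with the robot's sensors by 35%. We make our methods available as open source.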
